functional requirement
RBT4DNN: Requirements-based Testing of Neural Networks
Mozumder, Nusrat Jahan, Toledo, Felipe, Dola, Swaroopa, Dwyer, Matthew B.
Testing allows developers to determine whether a system functions as expected. When such systems include deep neural networks (DNNs), testing becomes challenging, as DNNs approximate functions for which the formalization of functional requirements is intractable. This prevents the application of well-developed approaches to requirements-based testing to DNNs. To address this, we propose a requirements-based testing method (RBT4DNN) that uses natural language requirements statements. These statements use a glossary of terms to define a semantic feature space that can be leveraged for test input generation. RBT4DNN formalizes preconditions of functional requirements as logical combinations of those semantic features. Training data matching these feature combinations can be used to fine-tune a generative model to reliably produce test inputs satisfying the precondition. Executing these tests on a trained DNN enables comparing its output to the expected requirement postcondition behavior. We propose two use cases for RBT4DNN: (1) given requirements defining DNN correctness properties, RBT4DNN comprises a novel approach for detecting faults, and (2) during development, requirements-guided exploration of model behavior can provide developers with feedback on model generalization. Our further evaluation shows that RBT4DNN-generated tests are realistic, diverse, and aligned with requirement preconditions, enabling targeted analysis of model behavior and effective fault detection.
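As a rough illustration of the idea of expressing a precondition as a logical combination of semantic features and selecting the training data that matches it, the following minimal sketch may help. It is not the RBT4DNN implementation: the feature names, the boolean annotation format, and all helper functions are hypothetical, and the fine-tuning of the generative model is not shown.

```python
from typing import Callable, Dict, List

# A sample is annotated with boolean semantic features drawn from a glossary.
# The feature names used below are hypothetical, chosen only for illustration.
Sample = Dict[str, bool]
Precondition = Callable[[Sample], bool]


def feature(name: str) -> Precondition:
    """Predicate that holds when the named feature is present and true."""
    return lambda sample: sample.get(name, False)


def all_of(*parts: Precondition) -> Precondition:
    """Logical AND of several predicates."""
    return lambda sample: all(p(sample) for p in parts)


def any_of(*parts: Precondition) -> Precondition:
    """Logical OR of several predicates."""
    return lambda sample: any(p(sample) for p in parts)


def negate(part: Precondition) -> Precondition:
    """Logical NOT of a predicate."""
    return lambda sample: not part(sample)


def select_matching(samples: List[Sample], pre: Precondition) -> List[Sample]:
    """Keep only the samples whose features satisfy the precondition."""
    return [s for s in samples if pre(s)]


# Hypothetical precondition: a vehicle is ahead AND it is not far away.
precondition = all_of(feature("vehicle_ahead"), negate(feature("vehicle_far")))

samples = [
    {"vehicle_ahead": True, "vehicle_far": False},
    {"vehicle_ahead": True, "vehicle_far": True},
    {"vehicle_ahead": False, "vehicle_far": False},
]
matching = select_matching(samples, precondition)
print(len(matching))
```

In this sketch only the matching samples would be passed on as fine-tuning data for a generative model; how features are actually annotated and combined in RBT4DNN is described in the paper itself.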
Application Modernization with LLMs: Addressing Core Challenges in Reliability, Security, and Quality
Ponnusamy, Ahilan Ayyachamy Nadar
AI-assisted code generation tools have revolutionized software development, offering unprecedented efficiency and scalability. However, multiple studies have consistently highlighted challenges such as security vulnerabilities, reliability issues, and inconsistencies in the generated code. Addressing these concerns is crucial to unlocking the full potential of this transformative technology. While advancements in foundational and code-specialized language models have made notable progress in mitigating some of these issues, significant gaps remain, particularly in ensuring high-quality, trustworthy outputs. This paper builds upon existing research on leveraging large language models (LLMs) for application modernization. It explores an opinionated approach that emphasizes two core capabilities of LLMs: code reasoning and code generation. The proposed framework integrates these capabilities with human expertise to tackle application modernization challenges effectively. It highlights the indispensable role of human involvement and guidance in ensuring the success of AI-assisted processes. To demonstrate the framework's utility, this paper presents a detailed case study, walking through its application in a real-world scenario. The analysis includes a step-by-step breakdown, assessing alternative approaches where applicable. This work aims to provide actionable insights and a robust foundation for future research in AI-driven application modernization. The reference implementation created for this paper is available on GitHub.
Multi-Objective Reinforcement Learning for Critical Scenario Generation of Autonomous Vehicles
Wu, Jiahui, Lu, Chengjie, Arrieta, Aitor, Ali, Shaukat
Autonomous vehicles (AVs) make driving decisions without human intervention. Therefore, ensuring AVs' dependability is critical. Despite significant research and development on AVs, their dependability assurance remains a significant challenge due to the complexity and unpredictability of their operating environments. Scenario-based testing evaluates AVs under various driving scenarios, but the unlimited number of potential scenarios highlights the importance of identifying critical scenarios that can violate safety or functional requirements. Such requirements are inherently interdependent and need to be tested simultaneously. To this end, we propose MOEQT, a novel multi-objective reinforcement learning (MORL)-based approach to generate critical scenarios that simultaneously test interdependent safety and functional requirements. MOEQT adapts Envelope Q-learning as the MORL algorithm, which dynamically adapts multi-objective weights to balance the relative importance between multiple objectives. MOEQT generates critical scenarios to violate multiple requirements through dynamically interacting with the AV environment, ensuring comprehensive AV testing. We evaluate MOEQT using an advanced end-to-end AV controller and a high-fidelity simulator and compare MOEQT with two baselines: a random strategy and a single-objective RL with a weighted reward function. Our evaluation results show that MOEQT achieved an overall better performance in identifying critical scenarios for violating multiple requirements than the baselines.
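To make the contrast with the single-objective baseline concrete, a weighted reward function can be sketched as a fixed weighted sum over a vector of per-requirement rewards. This is a minimal sketch under assumptions: the two objectives, the reward values, and the function name are hypothetical, and it shows neither MOEQT nor Envelope Q-learning, which adjusts the weights dynamically rather than fixing them.

```python
from typing import Sequence


def scalarize(reward_vector: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse a multi-objective reward into one number with fixed weights."""
    if len(reward_vector) != len(weights):
        raise ValueError("reward_vector and weights must have the same length")
    return sum(w * r for w, r in zip(weights, reward_vector))


# Hypothetical per-step reward: [safety-violation signal, functional-violation signal].
reward = [1.0, 0.0]

# The same step looks very different depending on the fixed weights chosen.
balanced = scalarize(reward, [0.5, 0.5])
functional_only = scalarize(reward, [0.0, 1.0])
print(balanced, functional_only)
```

With fixed weights, a step that violates only one requirement can be scored as uninformative, which is one way to read the abstract's motivation for balancing the objectives dynamically.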
F -- A Model of Events based on the Foundational Ontology DOLCE+DnS Ultralite
Scherp, Ansgar, Franz, Thomas, Saathoff, Carsten, Staab, Steffen
The lack of a formal model of events hinders interoperability in distributed event-based systems. In this paper, we present a formal model of events, called Event-Model-F. The model is based on the foundational ontology DOLCE+DnS Ultralite (DUL) and provides comprehensive support to represent time and space, objects and persons, as well as mereological, causal, and correlative relationships between events. In addition, the Event-Model-F provides a flexible means for event composition, modeling event causality and event correlation, and representing different interpretations of the same event. The Event-Model-F is developed following the pattern-oriented approach of DUL, is modularized in different ontologies, and can be easily extended by domain-specific ontologies.
Grand Challenges in the Verification of Autonomous Systems
Leahy, Kevin, Asgari, Hamid, Dennis, Louise A., Feather, Martin S., Fisher, Michael, Ibanez-Guzman, Javier, Logan, Brian, Olszewska, Joanna I., Redfield, Signe
Autonomous systems use independent decision-making with only limited human intervention to accomplish goals in complex and unpredictable environments. As the autonomy technologies that underpin them continue to advance, these systems will find their way into an increasing number of applications in an ever wider range of settings. If we are to deploy them to perform safety-critical or mission-critical roles, it is imperative that we have justified confidence in their safe and correct operation. Verification is the process by which such confidence is established. However, autonomous systems pose challenges to existing verification practices. This paper highlights viewpoints of the Roadmap Working Group of the IEEE Robotics and Automation Society Technical Committee for Verification of Autonomous Systems, identifying these grand challenges, and providing a vision for future research efforts that will be needed to address them.
Luban: Building Open-Ended Creative Agents via Autonomous Embodied Verification
Guo, Yuxuan, Peng, Shaohui, Guo, Jiaming, Huang, Di, Zhang, Xishan, Zhang, Rui, Hao, Yifan, Li, Ling, Tian, Zikang, Gao, Mingju, Li, Yutai, Gan, Yiming, Liang, Shuai, Zhang, Zihao, Du, Zidong, Guo, Qi, Hu, Xing, Chen, Yunji
Building open agents has always been the ultimate goal in AI research, and creative agents are all the more enticing. Existing LLM agents excel at long-horizon tasks with well-defined goals (e.g., 'mine diamonds' in Minecraft). However, they encounter difficulties on creative tasks with open goals and abstract criteria due to the inability to bridge the gap between them, thus lacking feedback for self-improvement in solving the task. In this work, we introduce autonomous embodied verification techniques for agents to fill the gap, laying the groundwork for creative tasks. Specifically, we propose the Luban agent, which targets creative building tasks in Minecraft and is equipped with two-level autonomous embodied verification inspired by human design practices: (1) visual verification of 3D structural speculations, which come from agent-synthesized CAD modeling programs; (2) pragmatic verification of the creation by generating and verifying environment-relevant functionality programs based on the abstract criteria. Extensive multi-dimensional human studies and Elo ratings show that the Luban completes diverse creative building tasks in our proposed benchmark and outperforms other baselines (33% to 100%) in both visualization and pragmatism. Additional demos on a real-world robotic arm show the creation potential of the Luban in the physical world.
Distilling Functional Rearrangement Priors from Large Models
Zeng, Yiming, Wu, Mingdong, Yang, Long, Zhang, Jiyao, Ding, Hao, Cheng, Hui, Dong, Hao
Object rearrangement, a fundamental challenge in robotics, demands versatile strategies to handle diverse objects, configurations, and functional needs. To achieve this, the AI robot needs to learn functional rearrangement priors in order to specify precise goals that meet the functional requirements. Previous methods typically learn such priors from either laborious human annotations or manually designed heuristics, which limits scalability and generalization. In this work, we propose a novel approach that leverages large models to distill functional rearrangement priors. Specifically, our approach collects diverse arrangement examples using both LLMs and VLMs and then distills the examples into a diffusion model. During test time, the learned diffusion model is conditioned on the initial configuration and guides the positioning of objects to meet functional requirements. In this manner, we create a handshaking point that combines the strengths of conditional generative models and large models. Extensive experiments on multiple domains, including real-world scenarios, demonstrate the effectiveness of our approach in generating compatible goals for object rearrangement tasks, significantly outperforming baseline methods.
Form follows Function: Text-to-Text Conditional Graph Generation based on Functional Requirements
Zachares, Peter A., Hovhannisyan, Vahan, Mosca, Alan, Gal, Yarin
This work focuses on the novel problem setting of generating graphs conditioned on a description of the graph's functional requirements in a downstream task. We pose the problem as a text-to-text generation problem and focus on the approach of fine-tuning a pretrained large language model (LLM) to generate graphs. We propose an inductive bias which incorporates information about the structure of the graph into the LLM's generation process by incorporating message passing layers into an LLM's architecture. To evaluate our proposed method, we design a novel set of experiments using publicly available and widely studied molecule and knowledge graph data sets. Results suggest our proposed approach generates graphs which more closely meet the requested functional requirements, outperforming baselines developed on similar tasks by a statistically significant margin.
Functional requirements to mitigate the Risk of Harm to Patients from Artificial Intelligence in Healthcare
García-Gómez, Juan M., Blanes-Selva, Vicent, Cenzano, José Carlos de Bartolomé, Cebolla-Cornejo, Jaime, Doñate-Martínez, Ascensión
The Directorate General for Parliamentary Research Services of the European Parliament has prepared a report to the Members of the European Parliament where they enumerate seven main risks of Artificial Intelligence (AI) in medicine and healthcare: patient harm due to AI errors, misuse of medical AI tools, bias in AI and the perpetuation of existing inequities, lack of transparency, privacy and security issues, gaps in accountability, and obstacles in implementation. In this study, we propose fourteen functional requirements that AI systems may implement to reduce the risks associated with their medical purpose: AI passport, User management, Regulation check, Academic use only disclaimer, data quality assessment, Clinicians double check, Continuous performance evaluation, Audit trail, Continuous usability test, Review of retrospective/simulated cases, Bias check, eXplainable AI, Encryption and use of field-tested libraries, and Semantic interoperability. Our intention here is to provide specific high-level specifications of technical solutions to ensure continuous good performance and use of AI systems to benefit patients in compliance with the future EU regulatory framework.
Cloud Render Farm Services Discovery Using NLP And Ontology Based Knowledge Graph
Annette, Ruby, Banu, Aisha, Priya, Sharon, Chandran, Subash
Cloud render farm services are Platform-as-a-Service (PaaS) cloud services that provide their cloud resources and a complete platform for rendering animation files [1, 2]. The animation files to be rendered are uploaded to the cloud render farm servers through the service provider's web interface [3, 4]. The rendering job queue manager assesses the uploaded files and assigns render nodes to complete the rendering job. Updates on the rendering process are displayed in the render management software dashboard, and the user can monitor, stop, or pause the rendering job and pays only for the rendering time during which the cloud render nodes were used. Hence, cloud render farm services are considered a cost-effective alternative for rendering needs, including in other fields such as fashion design; examples include Renderingfox, RenderRocket, and Rebusfarm [5, 6]. Much of our previous work has focused on creating a cloud broker service [7, 8, 9] that aggregates information about cloud render farms in order to recommend the right cloud render farm services, and that led to the realization of the significance of an ontology of cloud render farm services for discovering and recommending the right services. Though many have worked on cloud rendering [10, 11, 12] and on developing ontology-based service discovery engines for generic IaaS (Infrastructure-as-a-Service), such as Sim KM et al. [13, 14, 15], no work has considered developing a domain-specific service discovery engine for cloud render farm services of the PaaS type. This research work proposes an ontology-based, domain-specific service discovery engine named RenderSearch for cloud render farm services of the PaaS type. The contributions of this research work include the following: i) This work proposes a service discovery engine architecture for domain-specific cloud render farm services.